The  research  performed  in  this  project  has  examined  the  abilities  of  human  observers
to  perceive  3D  form  from  different  types  of  optical  structure  within  moving  or 
stationary  visual  images.  The  research  has  been  organized  into  four  general 
problem  areas,  including  the  low  level  detection  of  coherent  motion,  the  analysis 
of  3D  form  from  motion,  the  analysis  of  image  shading  and  texture,  and  the 
identification  of  image  contours.  Our  basic  strategy  in  all  of  these  areas  has 
been  to  identify  the  key  assumptions  of  current  computational  models;  to  test  the 
psychological  validity  of  those  assumptions  using  appropriate  psychophysical 
procedures;  and,  based  on  the  results  of  those  experiments,  to  develop  alternative 
models  that  more  closely  match  the  perceptual  capabilities  of  actual  human  observers. 
In  contrast  to  most  common  methods  of  3D  image  analysis,  which  are  designed  to
compute  precise  metrical  descriptions,  our  results  have  shown  that  human  perception
is  primarily  concerned  with  more  abstract  aspects  of  object  structure,  such  as
affine  or  ordinal  properties,  which  are  easier  to  compute  and  are  more  robust  to
uncontrolled  changes  in  viewing  conditions. 

The  research  performed  in  my  laboratory  during  the  past  three  years  under  AFOSR
support  has  been  designed  to  investigate  the  fundamental  mechanisms  involved  in  the 
perceptual  analysis  of  3-dimensional  form  and  motion.  Much  of  this  research  has  been 
specifically  motivated  from  a  computational  perspective.  Our  basic  research  strategy  has  been 
to  identify  the  key  assumptions  of  various  competing  models;  to  empirically  investigate  the 
relative  psychological  validity  of  those  assumptions  using  appropriate  psychophysical 
procedures;  and,  when  necessary,  to  develop  alternative  models  based  on  revised 
assumptions  that  more  closely  match  the  perceptual  capabilities  of  actual  human  observers. 

There  are  several  different  properties  of  optical  structure  from  which  observers  are  able 
to  obtain  information  about  the  structure  of  the  environment.  We  have  tried  in  our  research  to 
consider  a  broad  spectrum  of  issues  involving  the  analysis  and  integration  of  these  different 
sources  of  information,  and  to  identify  any  common  mechanisms  or  strategies  that  may  be 
involved  in  multiple  domains  of  perceptual  processing.  These  ongoing
programs  are  summarized  briefly  below.


The  Detection  of  Motion 

One  line  of  research  performed  in  my  laboratory  has  been  designed  to  examine  the 
competitive  and  cooperative  interactions  involved  in  the  detection  of  coherent  motion.  For
example,  Mingolla,  Todd  &  Norman  (1992)  have  recently  investigated  how  a  field  of  moving
contours,  each  contained  within  its  own  small  circular  aperture,  can  be  spatially  integrated  to
produce  a  perception  of  globally  coherent  motion.  Previous  theoretical  analyses  have  shown 
that  for  translatory  motion  it  is  possible  to  determine  the  correct  direction  of  translation  using  an 
intersection  of  constraints  in  velocity  space.  Our  results  suggest,  however,  that  human 
observers  analyze  such  patterns  using  a  simple  vector  average  of  the  component  velocities 
perpendicular  to  each  contour.  We  have  also  obtained  evidence  that  the  vector  averages  in 
different  regions  can  be  compared  at  a  higher  level  of  analysis  to  identify  different  categories  of 
motion  such  as  translation,  rotation  or  divergence. 
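
The difference between the two integration rules can be illustrated with a minimal sketch. It assumes that each aperture reveals only the velocity component along the unit normal of its contour, and it uses two hypothetical contour orientations chosen for illustration. The function names are our own, and the code is not the stimulus or analysis software used in the experiments.

```python
import numpy as np

def component_velocities(true_velocity, normal_angles_deg):
    """Return unit contour normals and the speed visible along each normal."""
    angles = np.radians(normal_angles_deg)
    normals = np.column_stack([np.cos(angles), np.sin(angles)])
    speeds = normals @ true_velocity  # only the normal component is visible
    return normals, speeds

def intersection_of_constraints(normals, speeds):
    """Solve normals @ v = speeds for v (least squares if overdetermined)."""
    v, *_ = np.linalg.lstsq(normals, speeds, rcond=None)
    return v

def vector_average(normals, speeds):
    """Average the component velocity vectors (speed times normal)."""
    return (normals * speeds[:, None]).mean(axis=0)

def direction_deg(v):
    """Direction of a 2D velocity in degrees, counterclockwise from +x."""
    return float(np.degrees(np.arctan2(v[1], v[0])))

# Hypothetical example: rightward translation seen through two apertures
# whose contour normals point 20 and 70 degrees away from the x axis.
true_velocity = np.array([1.0, 0.0])
normals, speeds = component_velocities(true_velocity, [20.0, 70.0])
ioc = intersection_of_constraints(normals, speeds)
avg = vector_average(normals, speeds)
print("IOC direction:", direction_deg(ioc))
print("Vector-average direction:", direction_deg(avg))
```

In this configuration the intersection of constraints recovers the true direction of translation, whereas the vector average is pulled toward the mean orientation of the contour normals. The two rules therefore make distinguishable predictions whenever the contour orientations are distributed asymmetrically about the true direction.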

A  related  series  of  experiments  by  Kramer  &  Todd  (1990)  and  Todd,  Kramer  &  Norman 
(1992)  has  examined  how  processes  of  spatiotemporal  integration  can  be  used  to  filter 
spurious  correlations  (i.e.,  false  targets)  that  can  arise  from  the  motions  of  densely  textured 
patterns.  In  these  studies  we  measured  the  maximum  displacement  thresholds  for  moving
patterns  that  varied  in  size,  shape,  duration  and  eccentricity,  and  contained  variable  amounts 
of  uncorrelated  noise.  The  results  indicate  that  the  human  visual  system  may  have  multiple 
mechanisms  of  noise  filtering  to  facilitate  the  detection  of  coherent  motion,  including  averaging 
over  space  and  correlations  of  features  or  motions  over  multiple  time  intervals.
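
The way spatial pooling can suppress false targets may be sketched with a one-dimensional random-dot example. This is an illustration under simplifying assumptions (binary dots, a circular shift, and a plain correlation score averaged over all dot positions); it is not the model or the stimuli from the studies cited above, and all names and parameter values are hypothetical.

```python
import numpy as np

def noisy_shifted_pair(n_dots, shift, noise_fraction, rng):
    """Two frames of +/-1 dots: frame 2 is frame 1 shifted, with some
    dots replaced by uncorrelated noise."""
    frame1 = rng.choice([-1.0, 1.0], size=n_dots)
    frame2 = np.roll(frame1, shift)
    noise_mask = rng.random(n_dots) < noise_fraction
    frame2[noise_mask] = rng.choice([-1.0, 1.0], size=int(noise_mask.sum()))
    return frame1, frame2

def estimate_shift(frame1, frame2, max_shift):
    """Score each candidate displacement by the correlation between the
    shifted first frame and the second frame, pooled over all positions."""
    shifts = np.arange(-max_shift, max_shift + 1)
    scores = np.array([np.mean(np.roll(frame1, s) * frame2) for s in shifts])
    return int(shifts[np.argmax(scores)]), scores

rng = np.random.default_rng(0)
frame1, frame2 = noisy_shifted_pair(n_dots=4000, shift=7,
                                    noise_fraction=0.5, rng=rng)
estimated, scores = estimate_shift(frame1, frame2, max_shift=20)
print("estimated displacement:", estimated)
```

Any single dot has many spurious matches at every candidate displacement, but those matches are uncorrelated across positions and average toward zero, while the true displacement accumulates a consistent score. Averaging the scores from several successive frame pairs would reduce the spurious correlations further, which corresponds to integration over multiple time intervals.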

The  Analysis  of  Structure  from  Motion 

A  second  line  of  research  performed  in  my  laboratory  has  been  designed  to  investigate 
the  visual  perception  of  3-dimensional  structure  from  motion.  During  the  past  decade  there 
have  been  numerous  theoretical  analyses  proposed  in  the  literature  for  computing  an  object's 
3D  form  from  a  sequence  of  projected  images.  One  of  the  primary  results  of  these  analyses  is 
that  a  minimum  of  three  distinct  views  is  required  in  order  to  obtain  an  unambiguous 
interpretation  of  an  arbitrary  configuration  rotating  in  depth  under  orthographic  projection.  In
an  effort  to  test  the  psychological  validity  of  this  conclusion,  we  have  performed  numerous 
experiments  using  a  wide  variety  of  stimuli  and  response  tasks  (see  Todd  et  al.,  1988;  Todd  &
Bressan,  1990;  Todd  &  Norman,  1991;  and  Norman  &  Todd,  1992,  1993).  The  results  indicate
that  different  aspects  of  3D  structure  are  judged  with  varying  degrees  of  accuracy,  but  that 
none  of  these  tasks  are  significantly  influenced  by  increasing  the  number  of  distinct  frames  in 
an  apparent  motion  sequence  beyond  two,  if  other  factors  are  held  constant. 

Based  on  these  findings,  Todd  &  Bressan  (1990)  performed  a  mathematical  analysis  to 
determine  what  type  of  information  is  available  from  a  2-frame  apparent  motion  sequence. 
This  analysis  showed  that  an  object's  structure  can  be  specified  from  such  displays  up  to  an 
affine  stretching  transformation  along  the  line  of  sight.  That  is  to  say,  although  there  is  not 
sufficient  information  in  two  views  to  specify  a  unique  Euclidean  structure,  the  set  of  possible
rigid  interpretations  is  constrained  to  a  one  parameter  family  of  structures  that  are  related  to 
one  another  by  an  affine  stretching  transformation.  This  analysis  is  also  consistent  with  the 
psychophysical  evidence  from  actual  human  observers.  Tasks  that  are  theoretically  possible
based  solely  on  an  analysis  of  affine  structure  (e.g.,  object  discriminations)  invariably  produce 
high  levels  of  performance,  whereas  those  that  are  theoretically  impossible  based  solely  on 
affine  structure  (e.g.,  3D  angle  discriminations)  invariably  produce  low  levels  of  performance. 
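
The one parameter family of interpretations can be made concrete with a small numerical sketch. It assumes, for simplicity, a rotation about a known vertical axis under orthographic projection, so that the only unknown is the rotation angle between the two views. The function names are ours, and the code illustrates the geometry rather than reproducing the analysis of Todd & Bressan (1990).

```python
import numpy as np

def two_views(points, angle):
    """Orthographic images of 3D points before and after a rotation by
    `angle` (radians) about the vertical (y) axis."""
    x, y, z = points.T
    view1 = np.column_stack([x, y])
    view2 = np.column_stack([x * np.cos(angle) + z * np.sin(angle), y])
    return view1, view2

def depth_for_assumed_angle(view1, view2, assumed_angle):
    """Depths that make the two views consistent with a rigid rotation
    by `assumed_angle` (which must not be a multiple of pi)."""
    x1, x2 = view1[:, 0], view2[:, 0]
    return (x2 - x1 * np.cos(assumed_angle)) / np.sin(assumed_angle)

rng = np.random.default_rng(1)
points = rng.uniform(-1.0, 1.0, size=(20, 3))
true_angle, other_angle = 0.2, 0.4
view1, view2 = two_views(points, true_angle)

z_true = depth_for_assumed_angle(view1, view2, true_angle)
z_other = depth_for_assumed_angle(view1, view2, other_angle)

# The alternative structure, rotated rigidly by the alternative angle,
# projects to exactly the same pair of images.
alt_points = np.column_stack([view1[:, 0], view1[:, 1], z_other])
_, alt_view2 = two_views(alt_points, other_angle)

# The two depth maps differ by an affine map that acts only along the
# line of sight: a depth scaling plus a term linear in x.
scale = np.sin(true_angle) / np.sin(other_angle)
shear = (np.cos(true_angle) - np.cos(other_angle)) / np.sin(other_angle)
z_predicted = scale * z_true + shear * view1[:, 0]
```

Every assumed angle yields a different depth map that is equally consistent with the two images, so the images alone cannot single out one Euclidean structure. Properties preserved by the affine map along the line of sight, such as the parallelism of lines, are shared by all members of the family, whereas 3D angles and depth-to-width ratios are not.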


The  Analysis  of  Shading  and  Texture 

A  third  line  of  research  performed  in  my  laboratory  has  been  designed  to  investigate  the 
visual  perception  of  3D  shape  from  shading  and  texture.  The  patterns  of  shading  and  texture 
in  an  image  have  several  environmental  determinants  including  the  shape  of  the  depicted 
object,  its  surface  reflectance  properties,  and  its  pattern  of  illumination.  Previous  theoretical 
analyses  have  demonstrated  that  it  is  possible  to  compute  the  local  orientation  of  any  given 
surface  region  if  provided  with  prior  knowledge  about  the  surface  reflectance  and  illumination 
in  that  region.  One  of  the  goals  of  our  research  has  been  to  psychophysically  investigate 
whether  similar  prior  knowledge  is  required  for  actual  human  perception.  Our  results  show 
clearly  that  it  is  not.  For  example,  Mingolla  &  Todd  (1989)  and  Todd  &  Reichel  (1989) 
demonstrated  that  observers’  judgments  of  local  orientation  or  ordinal  depth  for  ellipsoid 
surfaces  are  largely  unaffected  by  changes  in  surface  reflectance  or  the  direction  of 
illumination,  contrary  to  what  would  be  reasonable  to  expect  based  on  current  theory.  Similarly,  in  the
analysis  of  shape  from  texture  or  surface  contours,  Todd  &  Akerstrom  (1987)  and  Todd  & 
Reichel  (1990)  have  shown  that  human  perception  does  not  require  specific  assumptions 
about  the  shapes  of  individual  texture  elements  or  how  the  contours  align  with  the  local  surface 
curvature. 
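
The dependence of such computational analyses on prior knowledge can be illustrated with the simplest common shading model. The sketch below assumes Lambertian reflectance, in which image intensity equals the surface albedo times the cosine of the angle between the surface normal and the direction of illumination. The model choice, function names and numerical values are illustrative assumptions and are not taken from the studies cited above.

```python
import numpy as np

def lambertian_intensity(normal, light_dir, albedo):
    """Image intensity of a Lambertian patch: albedo * cos(incidence)."""
    n = np.asarray(normal, dtype=float)
    l = np.asarray(light_dir, dtype=float)
    n = n / np.linalg.norm(n)
    l = l / np.linalg.norm(l)
    return albedo * max(0.0, float(n @ l))

def incidence_angle_deg(intensity, assumed_albedo):
    """Angle between normal and light implied by an intensity, given an
    assumed albedo."""
    cosine = np.clip(intensity / assumed_albedo, -1.0, 1.0)
    return float(np.degrees(np.arccos(cosine)))

# A patch facing the viewer, lit from 30 degrees off the line of sight.
normal = [0.0, 0.0, 1.0]
light = [np.sin(np.radians(30.0)), 0.0, np.cos(np.radians(30.0))]
intensity = lambertian_intensity(normal, light, albedo=0.8)

angle_correct = incidence_angle_deg(intensity, assumed_albedo=0.8)
angle_wrong = incidence_angle_deg(intensity, assumed_albedo=1.0)
print(angle_correct, angle_wrong)
```

With the correct albedo the angle of incidence is recovered, but an incorrect assumption about reflectance changes the estimate substantially, and a similar error arises from an incorrect light direction. Even with correct assumptions, a single intensity constrains the normal only to a cone about the light direction, so additional constraints are needed to fix the local orientation.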

Much  of  our  most  recent  research  in  this  area  has  focussed  primarily  on  the  particular 
properties  of  surface  structure  that  are  perceptually  specified  by  patterns  of  shading  and  texture.
This  work  has  proceeded  in  an  analogous  fashion  to  our  research  on  perceived  structure  from 
motion,  in  that  we  have  designed  a  wide  variety  of  response  tasks  to  probe  different  aspects  of 
local  surface  structure,  including  ordinal  structure  (Todd  &  Reichel,  1989;  Reichel  &  Todd, 
1990),  affine  structure  (Reichel  &  Todd,  1991,  1992),  and  conformal  structure  (Mingolla  &
Todd,  1989).  We  have  also  performed  numerous  experiments  to  identify  the  processing
mechanisms  with  which  these  different  properties  are  perceptually  analyzed.  These  include 
both  masking  and  reaction  time  paradigms  to  assess  the  build-up  of  perceptual  knowledge  over
time,  and  a  delayed  partial  report  paradigm  to  measure  its  decay  in  short  term  memory. 

Contour  Identification

Almost  all  computational  analyses  for  determining  3D  structure  from  visual  images  are 
designed  to  operate  on  abrupt  changes  in  image  structure  over  space  called  optical  contours.
A  serious  theoretical  problem  that  is  seldom  mentioned  in  this  context  is  that  optical  contours
can  arise  due  to  a  wide  variety  of  physical  phenomena  (e.g.,  smooth  occlusions,  specular 
highlights,  or  abrupt  changes  in  surface  orientation,  reflectance  or  illumination).  It  is  important 
to  keep  in  mind,  however,  that  most  computational  analyses  can  only  be  used  with  a  particular 
contour  type.  Thus,  since  contour  identification  seems  to  be  a  necessary  prerequisite  for  the 
computational  analysis  of  3D  form,  Todd  &  Schnittman  (1991)  have  recently  performed  a 
series  of  experiments  to  measure  the  minimal  amount  of  contextual  information  needed  to 
identify  different  types  of  contours  in  digitized  images  of  natural  objects.  The  results 
demonstrate  that  observers  can  reliably  identify  an  optical  contour  in  a  few  hundred
milliseconds  even  when  there  is  insufficient  information  in  a  scene  to  reliably  identify  any  given 
object.  We  are  currently  attempting  to  analyze  these  reduced  context  images  in  an  effort  to 
discover  which  specific  optical  properties  are  used  to  discriminate  between  different  types  of 
contours. 

Another  important  theoretical  issue  that  needs  to  be  considered  in  this  context  is  that 
different  types  of  contours  produce  different  types  of  optical  deformation  when  objects  are 
viewed  stereoscopically  or  in  motion.  Existing  models  for  the  computational  analysis  of 
structure  from  motion  or  stereopsis  require  that  observed  objects  contain  identifiable 
landmarks  that  can  be  tracked  over  time.  There  are  many  other  types  of  commonly  observed 
events,  however,  such  as  the  deformations  of  shadows  or  smooth  occlusion  contours,  for
which  these  analyses  are  inadequate,  even  as  a  local  approximation.  Human  observers,  in
contrast,  are  able  to  interpret  these  events  correctly.  Todd,  Norman  &  Fukuda  (1992)  have 
demonstrated  that  observers  can  reliably  discriminate  rigid  from  nonrigid  motion  when  all  that
is  visible  is  the  optical  deformation  of  an  object's  silhouette  or  its  cast  shadow  on  a  background  surface.
A  primary  focus  of  our  ongoing  research  is  to  discover  how  this  is  accomplished. 